Interaction between High-Level and Low-Level Image Analysis for Semantic Video Object Extraction
نویسندگان
چکیده
The task of extracting a semantic video object is split into two subproblems, namely, object segmentation and region segmentation. Object segmentation relies on a priori assumptions, whereas region segmentation is data-driven and can be solved in an automatic manner. These two subproblems are not mutually independent, and they can benefit from interactions with each other. In this paper, a framework for such interaction is formulated. This representation scheme based on region segmentation and semantic segmentation is compatible with the view that image analysis and scene understanding problems can be decomposed into lowlevel and high-level tasks. Low-level tasks pertain to region-oriented processing, whereas the high-level tasks are closely related to object-level processing. This approach emulates the human visual system: what one “sees” in a scene depends on the scene itself (region segmentation) as well as on the cognitive task (semantic segmentation) at hand. The higher-level segmentation results in a partition corresponding to semantic video objects. Semantic video objects do not usually have invariant physical properties and the definition depends on the application. Hence, the definition incorporates complex domain-specific knowledge and is not easy to generalize. For the specific implementation used in this paper, motion is used as a clue to semantic information. In this framework, an automatic algorithm is presented for computing the semantic partition based on color change detection. The change detection strategy is designed to be immune to the sensor noise and local illumination variations. The lower-level segmentation identifies the partition corresponding to perceptually uniform regions. These regions are derived by clustering in an N-dimensional feature space, composed of static as well as dynamic image attributes. We propose an interaction mechanism between the semantic and the region partitions which allows to cope with multiple simultaneous objects. Experimental results show that the proposed method extracts semantic video objects with high spatial accuracy and temporal coherence.
منابع مشابه
Content-based Retrieval of Digital Video
In this proposal a new method for content-based retrieval of digital video is proposed based on a neurophysiological model. Effective content-based retrieval requires access to video objects which represent syntactic structures in a video sequence such as people, cars, shots and scenes. Existing techniques for extracting video objects are based on inverse optics, where three-dimensional objects...
متن کاملSemiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches, guide the researchers to use combining approaches and semi-automatic retrieval using the user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...
متن کاملCombining Motion Understanding and Keyframe Image Analysis for Broadcast Video Information Extraction
We describe a robust new approach to extract semantic concept information based on explicitly encoding static image appearance features together with motion information. For high-level semantic concept identification detection in broadcast video, we trained multi-modality classifiers which combine the traditional static image features and a new motion feature analysis method (MoSIFT). The exper...
متن کاملSemantic Video Analysis
OVERVIEW The objective of this component is to index videos based on semantic mid to high-level features. To achieve this, the component integrates different modules for video processing. As shown in the diagram, the component integrates the following components in order to extract the embedded semantics from the video: shot boundary detection for categorising shots with similar attributes; key...
متن کاملRecognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- EURASIP J. Adv. Sig. Proc.
دوره 2004 شماره
صفحات -
تاریخ انتشار 2004